Simulating Planners’ Interactions With the Treatment Planning System: A Reinforcement Learning Study for Pancreas SBRT Planning
Similar Resources
Reinforcement Planning: Planners as Policies
Introduction. State-of-the-art robotic systems [1, 2, 3] increasingly rely on search-based planning or optimal control methods to guide decision making. Similar observations can be made about computer game engines. Such methods are nearly always extremely crude approximations to the reality encountered by the robot: they consider a simplified model of the robot (as a point, or a “flying brick”)...
Locomotion Planning with 3D Character Animations by Combining Reinforcement Learning Based and Fuzzy Motion Planners
Motion and locomotion planning are widely used across different fields. Locomotion planning with premade character animations has attracted considerable attention in recent years. Reinforcement Learning presents promising ways to create motion planners using premade character animations. Although RL-based motion planners offer great ways to control character animations, they have some problems that...
RTP-Q: A Reinforcement Learning System with Time Constraints Exploration Planning for Accelerating the Learning Rate
Reinforcement learning is an efficient method for solving Markov Decision Processes in which an agent improves its performance using scalar reward values, with a high capability for reactive and adaptive behavior. Q-learning is a representative reinforcement learning method that is guaranteed to obtain an optimal policy but needs numerous trials to achieve it. k-Certainty Exploration Learning Sy...
Planning with neural networks and reinforcement learning
Planning with neural networks; time limits of discounted reinforcement learning; planning, taskability, Dyna-PI architectures. Dyna-PI architectures: focussing, forward and backward planning, acting and (re)planning. Tested with... Ideas from problem solving and
Combining Reinforcement Learning with Symbolic Planning
One of the major difficulties in applying Q-learning to real-world domains is the sharp increase in the number of learning steps required to converge towards an optimal policy as the size of the state space is increased. In this paper we propose a method, PLANQ-learning, that couples a Q-learner with a STRIPS planner. The planner shapes the reward function, and thus guides the Q-learner quickly ...
Journal
Journal title: International Journal of Radiation Oncology*Biology*Physics
Year: 2020
ISSN: 0360-3016
DOI: 10.1016/j.ijrobp.2020.07.615